[57]

ABSTRACT

The present invention facilitates the transmission of a file to
a computer where the receiving computer has a file (called
the old file) that is related to the file being transmitted (called
the new file) but where the sending computer does not know
the status or content of the old file. As a preliminary step, one
of the computers generates a Delimiter Selection Profile
Table (DSPT). The DSPT is generated by first determining
the number of times each delimiter in a set of delimiters
appears in the file and the distance between the locations of
the delimiters in the file. Next, using the information in the
DSPT, one of the delimiters is chosen as the delimiter which
will be used and this delimiter is transmitted to the computer
which did not generate the DSPT. The receiving computer
next generates a Segment Profile Table (SPT) of the old file and
the sending computer generates an SPT of the new file. The SPT is
generated by calculating a hash code (such as a CRC) for
each segment which is defined by the selected delimiter. The
hash codes from the old file are transmitted to the sending
computer. The sending computer then sends to the receiving
computer those segments in the new file that do not have a
hash code number which matches one of the hash code
numbers from the old file. The sending computer also sends
an indication of where segments from the old file should be
inserted into the new file. The receiving computer then
constructs the new file from the segments received and from the
appropriate old segments.

7 Claims, 6 Drawing Sheets

[FIG. 3 flow chart: initialize table to 0; initialize file pointer
to 0; set delimiter length to 2; open file to be analyzed; position
file to file pointer; read in two bytes; is this end of file?
(yes/no); using the two bytes read as an index, access one line of
the DSPT; update data in the accessed line of the DSPT (block 314):
A) increment number of occurrences, B) compute length of segment,
C) if length is longest, put it in the longest length column,
D) update offset to previous occurrence.]

[FIG. 6 flow chart: initialize: A) clear Segment Profile Table to
blank, B) set segment table segment number to 0.]
FILE TRANSFER METHOD AND APPARATUS UTILIZING DELIMITERS

This is a Continuation of application Ser. No. 08/176,950, filed
Jan. 3, 1994, now abandoned.

FIELD OF THE INVENTION

The present invention relates to electronic computers and more
particularly to the transfer of files between computers.

BACKGROUND OF THE INVENTION

There are a wide variety of commercially available computer
programs which facilitate transferring files between computers
utilizing modems and telephone lines. Among the commercially
available programs which provide such functions are: "crosstalk"
marketed by DCA Corp of Atlanta, Ga.; "QModem" marketed by Forgin
Inc. of Cedar Falls, Iowa; and "Close-Up" marketed by Norton
Lambert Corp of Santa Barbara, Calif.

The physical characteristics of normal telephone lines limit the
transmission speed which can be used to transmit data over such
lines. In order to shorten the time required to transmit data,
various types of data compression and error correcting protocols
are in widespread use.

Where a system transmitting a data file from a first computer to a
second computer is merely updating a file which has been previously
transmitted to the second computer, the transmission time can be
shortened by merely transmitting information which covers the
"differences" between the present file and the previously
transmitted file. This technique of only transmitting the "delta"
between two files is only applicable in situations where the
sending system knows the state of the file at the receiving
station.

The present invention provides a technique for transmitting files
between computers where the computer receiving the information has
a file related to the file being transmitted, but where the sending
computer does not know the state of the file at the receiving
computer.

SUMMARY OF THE INVENTION

The present invention facilitates the transmission of a file to a
computer where the receiving computer has a file (called the old
file) that is related to the file being transmitted (called the new
file) but where the sending computer does not know the status or
content of the old file. With the present invention, as a
preliminary step, one of the computers generates a Delimiter
Selection Profile Table (DSPT). Either the receiving computer
generates a DSPT of the old file or the sending computer generates
a DSPT of the new file. The DSPT is generated by first determining
the number of times each delimiter in a set of delimiters appears
in the file and the distance between the locations of the
delimiters in the file. Next, using the information in the DSPT,
one of the delimiters is chosen as the delimiter which will be used
and this delimiter is transmitted to the computer which did not
generate the DSPT. The receiving computer next generates a Segment
Profile Table (SPT) of the old file and the sending computer
generates an SPT of the new file. The SPT is generated by
calculating a hash code (such as a CRC) for each segment which is
defined by the selected delimiter. The hash codes from the old file
are transmitted to the sending computer. The sending computer then
sends to the receiving computer those segments in the new file that
do not have a hash code number which matches one of the hash code
numbers from the old file. The sending computer also sends an
indication of where segments from the old file should be inserted
into the new file. The receiving computer then constructs the new
file from the segments received and from the appropriate old
segments.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an overall block diagram of the computer systems.

FIGS. 2A to 2C are tables which show a simple example of how a
delimiter divides a file into segments.

FIG. 2D is a simple example of a Delimiter Selection Profile Table.

FIG. 3 is a flow diagram showing how the Delimiter Selection
Profile Table is generated.

FIG. 4 is a table showing how segments of a file fit together.

FIG. 5 is an example of a Segment Profile Table.

FIG. 6 is a diagram showing how a Segment Profile Table is
generated.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

An overall diagram of interconnected computers that can be used to
practice the invention is shown in FIG. 1. A sending computer 1 is
connected to a receiving computer 2 through modems 3 and 4 and a
telephone line 5.

Computer 1 has an internal memory 11 which has stored therein a
number of programs and files. It is noted that while programs and
files are shown as being in the memory of computer 1, a substantial
part of these programs could alternatively be on conventional
storage devices such as on [illegible]. How programs and files are
stored is not relevant to the present invention and it can be in
any conventional manner.

Computer memory 11 includes an operating system 14, a communication
program 15, a file manipulation program 16, a new file 13 and other
files 12. The operating system 14 could be the DOS operating system
that is marketed by Microsoft Corporation and the communications
program 15 can be a conventional communication program. The new
file 13 is the file which computer 1 would like to transmit to
computer 2. The file manipulation program 16 is the program which
implements the main parts of the present invention.

The computer 2 is substantially identical to the computer 1. It has
memory 21, operating system 24, file manipulation program 26, and
communication program 25. These components are similar to the
corresponding components in computer 1. The old file 23 is a file
that is related to file 13 [illegible] of file 23 are similar
[illegible] represents the copy of file 13 which is transmitted to
computer 2 using the present invention. The preferred embodiment of
the invention described herein transfers a file 13 from computer 1
to computer 2 using the following major steps.

a) One of the files on computer 2 is designated as old file 23.

b) The new file on computer 1 is analyzed and a Delimiter Selection
Profile Table (DSPT) is generated.

c) A delimiter is selected based on the information in the DSPT.

d) Both the new and the old file are divided into segments based on
the selected delimiter and the segments are analyzed and a Segment
Profile Table (SPT) is generated for both the old and the new
files. A hash number for each segment is generated.

e) Those segments in the new file which do not have a hash number
corresponding to the hash number of a segment in the old file are
sent to the receiving computer, and

f) The receiving computer combines the parts of the new file that
were transmitted with segments from the old file which have hash
numbers corresponding to segments in the new file to construct a
replica or copy of the new file at the receiving computer.
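
As an illustration only, steps d) through f) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of the preferred embodiment: the function names (split_segments, plan_transfer, rebuild) are hypothetical, zlib.crc32 stands in for whatever hash or CRC is actually used, and each segment is assumed to carry its terminating delimiter so that concatenating segments reproduces the file.

```python
import zlib

def split_segments(data: bytes, delim: bytes) -> list:
    """Split data into segments, each ending with delim (or at end of data).
    Assumption: the delimiter is kept as the tail of its segment."""
    if not delim:
        raise ValueError("delimiter must not be empty")
    segments, start = [], 0
    while start < len(data):
        hit = data.find(delim, start)
        end = len(data) if hit == -1 else hit + len(delim)
        segments.append(data[start:end])
        start = end
    return segments

def plan_transfer(new: bytes, old_hashes: set, delim: bytes) -> list:
    """Sender side: decide, per segment of the new file, whether to send
    the bytes or only tell the receiver to reuse an old segment."""
    plan = []
    for seg in split_segments(new, delim):
        crc = zlib.crc32(seg)
        if crc in old_hashes:
            plan.append(("reuse", crc))   # receiver already holds this segment
        else:
            plan.append(("send", seg))    # segment must travel over the line
    return plan

def rebuild(plan: list, old: bytes, delim: bytes) -> bytes:
    """Receiver side: combine received segments with matching old segments."""
    by_hash = {zlib.crc32(s): s for s in split_segments(old, delim)}
    return b"".join(by_hash[x] if kind == "reuse" else x for kind, x in plan)

old = b"alpha;;beta;;gamma;;"
new = b"alpha;;BETA2;;gamma;;delta"
delim = b";;"

# The receiver would transmit these hashes (its SPT) to the sender.
old_hashes = {zlib.crc32(s) for s in split_segments(old, delim)}
plan = plan_transfer(new, old_hashes, delim)
sent = sum(len(x) for kind, x in plan if kind == "send")
print(rebuild(plan, old, delim) == new, sent, "of", len(new), "bytes sent")
```

In this toy example only the two changed segments cross the line; the "reuse" entries play the role of the indication of where old segments are to be inserted.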

Each of the above major steps will now be explained in 
detail as will their purpose and how they are carried out. 
FIGS. 2A to 2D are diagrams that facilitate an explanation of
what constitutes a delimiter. It is noted that these are
simplified examples and they do not show the delimiter used
in the preferred embodiment of the invention shown and
described herein. The actual delimiter used in the preferred
embodiment of the invention will be explained later.

Each of FIGS. 2A to 2C shows a file which has the
characters:

A G Z E D C R F V Z E R T E G E B Z R G Y U J N E X I Z Q

FIG. 2A shows dividing the file with the delimiter "Z". As
shown in FIG. 2A, the delimiter Z divides the file into five
segments. The first segment is "A G", the second segment is
"E D C R F V", etc. The lengths of the segments are 2, 6, 7, 9,
and 1.

FIG. 2B shows dividing the file with the delimiter "E". In
this case the file is divided into 6 segments. FIG. 2C shows
dividing the file with the delimiter "G". As can be seen by
FIGS. 2A, 2B and 2C, the number of segments in a file and
the length of the segments depend on which delimiter is
selected to divide the file.
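
The segment counts and lengths quoted for the example file can be checked with a short Python sketch. The helper name segment_lengths is hypothetical, and the sketch follows the simplified convention of FIGS. 2A to 2C in which the delimiter itself is not counted as part of a segment.

```python
# The example file of FIGS. 2A to 2C, written without the spaces.
FILE = "AGZEDCRFVZERTEGEBZRGYUJNEXIZQ"

def segment_lengths(text: str, delim: str) -> list:
    """Lengths of the pieces obtained by cutting text at every delim."""
    return [len(piece) for piece in text.split(delim)]

for d in "ZEG":
    lengths = segment_lengths(FILE, d)
    print(d, len(lengths), "segments, lengths", lengths)
```

For the delimiter "Z" this gives five segments of lengths 2, 6, 7, 9 and 1, and for "E" it gives six segments, in agreement with the text above.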

FIG. 2D shows a Delimiter Selection Profile Table
(DSPT). The results of using the delimiters "Z", "E", and
"G" as shown in FIGS. 2A, 2B and 2C have been entered into
the DSPT in FIG. 2D. The DSPT shows the number of
occurrences of each delimiter in the file, the longest length
segment for the delimiter and the offset to the previous
occurrence of the delimiter. The offset is what is used to calculate
the length of the segment.

It is noted that the typical computer file consists of eight
bit bytes. Thus if any one of the possible combinations of the
eight bits in a byte is taken as a delimiter there are two
hundred and fifty six possible one byte delimiters. It is noted
that the example in FIGS. 2A to 2D shows alphabetic
characters as delimiters, and there are only twenty six of such
characters; however, the eight bits in each byte of a
computer file can also have other configurations than the
configurations that give the twenty six alphabetic characters.
In fact as previously explained there are two hundred and
fifty six possible configurations of the eight bits in a single
byte.

A delimiter can also be a particular two byte combination.
In fact it could also be a particular three byte combination.
If two byte combinations are used as delimiters there are
65,536 possible delimiters. If three byte combinations are
used as delimiters there are 16,777,216 possible
combinations of the twenty four bits in the three bytes. In the
preferred embodiment of the invention described herein, a
sixteen bit, two byte delimiter is used.
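
The counts above follow from the fact that an n byte delimiter has 8n bits and therefore 2^(8n) possible values. A one line Python check (the function name delimiter_count is hypothetical):

```python
def delimiter_count(n_bytes: int) -> int:
    """Number of distinct delimiters that are n_bytes long (8 bits per byte)."""
    return 2 ** (8 * n_bytes)

for n in (1, 2, 3):
    print(n, "byte(s):", delimiter_count(n))
```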

FIG. 3 is a block diagram of the subroutine that is used to
calculate the Delimiter Selection Profile Table (DSPT). At
the beginning of the routine (block 301), the entries in the
DSPT are initialized to "0". Next the file pointer (i.e. the
pointer to the location in the file being analyzed) is set to 0
(block 302). In the preferred embodiment used herein the
delimiter length is "two". After the parameters are initialized
the file being analyzed is opened (block 304). The file
pointer is next set to a value which points to the beginning
of the file (block 310) and a number of bytes equal to the
length of the delimiter is read.

This is the beginning of a loop and if the end of the file
is detected, the process goes to a termination routine (block
318). If this is not the end of the file, the two bytes which
were read are used as an index to access one line of the
DSPT (block 312). Note the outline of the DSPT is shown
in FIG. 2D. The accessed line in the DSPT is then updated
by:

a) Incrementing the number of occurrences,

b) Computing the length of the segment by using the
"Offset of the previous occurrence" from the last
column of the DSPT. If the computed length is longer than
the previous "longest length", the computed length is
put in column 2 of the DSPT.

c) Updating the offset in the previous occurrence column of the
DSPT by placing the present value of the file
pointer in this field.

Once the line in the DSPT is updated, the file pointer is
incremented (block 316) and the loop is repeated by
returning to block 310, which repositions the file pointer. Two bytes
are then read in. Note, the file pointer had only been
incremented by one, hence one of the bytes which is read in
is the same byte as was previously read in.
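
The loop of FIG. 3 can be sketched as follows. This is an illustrative reading, not the subroutine of the preferred embodiment: the name build_dspt and the dictionary layout are hypothetical, the table is held as a sparse dictionary rather than a 65,536 line array, and the segment length is assumed to be the present file pointer minus the stored offset of the previous occurrence (0 before the first occurrence), which the text does not spell out.

```python
def build_dspt(data: bytes, delim_len: int = 2) -> dict:
    """Slide a delim_len-byte window over data one byte at a time and
    record, per window value: occurrence count, longest distance since
    the previous occurrence, and offset of the latest occurrence."""
    table = {}
    pos = 0                                  # the "file pointer"
    while pos + delim_len <= len(data):      # stop when a full read is impossible
        key = data[pos:pos + delim_len]      # bytes read, used as the index
        row = table.setdefault(key, {"count": 0, "longest": 0, "prev": 0})
        row["count"] += 1                    # a) increment occurrences
        length = pos - row["prev"]           # b) assumed length computation
        if length > row["longest"]:
            row["longest"] = length
        row["prev"] = pos                    # c) remember this occurrence
        pos += 1                             # advance by one byte only
    return table

table = build_dspt(b"abXYcdefXYgXY")
print(table[b"XY"])
```

Because the pointer advances by one byte while two bytes are read, consecutive windows overlap by one byte, as noted in the text.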

When the end of the file is reached the process goes to 
block 318 which closes the file and to block 319 where the 
data in the DSPT is used to select a delimiter. A variety of 
algorithms could be used to select a delimiter. However, in 
the preferred embodiment described herein, the delimiter is 
selected by first going through the DSPT and choosing the 
first delimiter where the number of occurrences is greater 
than 50 and the average length between delimiters is greater 
than 1,000 and less than 30,000. If on the first pass through
the DSPT no delimiter is found which meets the selection
criteria, the selection criteria are eased. For example, on the
second pass one would try to find a delimiter where the
number of occurrences is greater than 40 and the average
length between delimiters is greater than 900 and less than
25,000. The various selection criteria can at first be chosen
relatively arbitrarily. While delimiters chosen arbitrarily will
in fact work, experience will show the particular criteria that
work most satisfactorily with each type of file.
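
A minimal sketch of this multi-pass selection is given below. The name select_delimiter is hypothetical, the table is reduced to a mapping from delimiter to occurrence count, and the "average length between delimiters" is assumed to be the file length divided by the number of occurrences; the text does not state how the average is computed. Only the two sets of thresholds quoted above are used.

```python
# (minimum occurrences, minimum average length, maximum average length)
PASSES = ((50, 1000, 30000), (40, 900, 25000))

def select_delimiter(counts: dict, file_len: int, passes=PASSES):
    """Return the first delimiter (in table order) meeting the criteria,
    easing the criteria on each later pass; None if nothing qualifies."""
    for min_count, min_avg, max_avg in passes:
        for delim in sorted(counts):             # walk the table in index order
            count = counts[delim]
            if count == 0:
                continue
            avg = file_len / count               # assumed definition of the average
            if count > min_count and min_avg < avg < max_avg:
                return delim
    return None

print(select_delimiter({b"ab": 10, b"cd": 60, b"ef": 45}, 120000))
print(select_delimiter({b"ef": 45}, 45000))
```

In the second call nothing passes the first set of criteria (45 occurrences is not greater than 50), so the delimiter is only accepted on the eased second pass.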

It is noted that the delimiter selection process can either 
take place on the new file which is located on the sending 
computer 1 or it can take place on the old file which is on 
receiving computer 2. For the preferred embodiment of the 
invention shown herein, the delimiter selection process takes 
place by analyzing the new file on the sending computer 1. 

Once the delimiter has been selected, it is transferred to
the computer which did not perform the selection process.
Since in this embodiment the selection process is performed
by the sending computer, the selected delimiter is then sent
to the receiving computer.

The selected delimiter and algorithm shown in FIG. 6 are 
used by the sending computer 1 to analyze the new file and 
by receiving computer 2 to analyze the old file. Prior to 
explaining the process shown in FIG. 6, the Segment Profile
Table (SPT) shown in FIG. 5 will be explained.

An SPT has one line for each segment in a file. A segment
is a series of bytes that is terminated by the selected
delimiter or by an end of file marker.

The segment profile table gives three numbers for each
segment. The first number is an offset which tells where the
segment begins and the second number is the length of the
segment. The third number in the SPT relative to each
segment is a hash number or CRC for the segment. It is
noted that the hash number used herein is the CRC (cyclical
redundancy check number); however, it can be any one of the
well known types of hash numbers which generate
a unique identifying number from a series of bits or bytes.
It is also noted that while a CRC or other hash number is
described herein as defining a unique segment,
mathematically, duplication or errors are possible. The likelihood of
such errors is so remote (less likely than a failure in the
typical computer hardware) that the hash numbers can, for the
purposes of the present invention, be considered to be unique. It is also
noted that to decrease the likelihood of error and to facilitate
computation, two concatenated CRC numbers can be used.
The particular technique used to generate hash numbers (e.g.
CRC numbers) forms no part of the present invention and
can be conventional. The algorithm in FIG. 6 is used to
calculate the entries in the SPT for each of the files. The
sending computer calculates these values for the new file
and the receiving computer calculates these numbers for the
old file.

Block 601 shows that the initialization operations include
the following:

a) Clear the Segment Profile Table (SPT), that is, set all
entries in the SPT to blanks.

b) Set the segment number in the first line of the SPT to "0".

c) Set the file position pointer to "0".

d) Set the length entry in the first line of the SPT to "0".

e) Set the CRC entry in the first line of the SPT to "0".

f) Open the file being analyzed.

The function of blocks 602, 603, 605 and 608 is to read
two bytes from the file, to calculate the CRC of these bytes,
and to determine if the end of file has been reached. Two
passes through this loop are required to calculate the CRC of
a two byte delimiter. When the output of block 608 is "yes",
the Segment Profile Table (SPT) is updated as shown by
block 609. The updating operations include:

a) Access a line in the SPT using the segment number
as an index.

b) Store the length of the segment and the CRC of the
segment in the accessed line.

c) Increment the SPT index, that is, move to the next line
of the SPT and store the SPT index in the first column
of the SPT.

d) Set the offset in the indexed line of the SPT to the value
of the file position pointer.

e) Set the length in the indexed line of the SPT to "0".

f) Set the CRC in the indexed line of the SPT to "0".

At the end of the operations shown in block 609, the
process repeats, that is, the process returns to block 602.
When block 602 finally detects the end of file condition, the
operations go to block 610 which updates the last line in the
SPT.
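
The result of the FIG. 6 procedure can be sketched in Python as shown below. This is a simplified sketch rather than the block-by-block algorithm: the name build_spt is hypothetical, zlib.crc32 is used as a stand-in CRC (the text leaves the hash technique open), the CRC is computed over a whole segment at once instead of incrementally, and each segment is assumed to include its terminating delimiter.

```python
import zlib

def build_spt(data: bytes, delim: bytes) -> list:
    """Return one (offset, length, crc) line per segment of data.
    A segment ends at the selected delimiter or at end of file."""
    if not delim:
        raise ValueError("delimiter must not be empty")
    spt = []
    start = 0                                   # file position pointer
    while start < len(data):
        hit = data.find(delim, start)
        end = len(data) if hit == -1 else hit + len(delim)
        segment = data[start:end]
        spt.append((start, len(segment), zlib.crc32(segment)))
        start = end                             # next segment begins here
    return spt

for line in build_spt(b"AGZEDCRFVZQ", b"Z"):
    print(line)
```

The list index of each line serves as the segment number, so it is not stored separately in this sketch.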

It is noted that as described herein, the delimiter is
described as being two bytes long. Alternatively delimiters
of other lengths could be used. In particular, in specialized
applications, delimiter lengths specific to the application
could be used.

It is noted that as described above, computer 2 has a single
"old file" 23. In most personal computers or work stations,
a large number of files are stored. As a first step in the
transfer process, one of the files on computer 2 is selected as
the "old file" 23. This is done by transmitting to computer 2
from computer 1, the file name and file type of the file that
is being transmitted. A file on computer 2 is selected to be
"old file" 23 according to the following priorities.

1) Select a file with the same file name and the same file
type. If no such file,

2) Select a file with the same file name. If no such file,

3) Select a file with the same file type and where the
names differ by no more than two characters. If no such
file,

4) Select a file with the same file type. If no such file,

5) Select any file on an arbitrary basis.
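
The priority list above can be sketched as a chain of progressively weaker rules. The names choose_old_file and name_distance are hypothetical, files are represented as (name, type) pairs, and "differ by no more than two characters" is interpreted, as an assumption, as the number of mismatching positions plus any difference in length.

```python
def name_distance(a: str, b: str) -> int:
    """Assumed metric: mismatching positions plus difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

def choose_old_file(target: tuple, candidates: list):
    """Pick the candidate (name, type) that best matches target, trying
    each rule in priority order; None if there are no candidates."""
    name, ftype = target
    rules = [
        lambda n, t: n == name and t == ftype,                 # same name and type
        lambda n, t: n == name,                                # same name
        lambda n, t: t == ftype and name_distance(n, name) <= 2,
        lambda n, t: t == ftype,                               # same type
        lambda n, t: True,                                     # arbitrary file
    ]
    for rule in rules:
        for cand in candidates:
            if rule(*cand):
                return cand
    return None

files = [("notes", "txt"), ("report", "doc"), ("repart", "txt")]
print(choose_old_file(("report", "txt"), files))
```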

While the invention has been shown and described with 
respect to particular embodiments thereof, it will be understood
by those skilled in the art that a variety of changes in
form and detail may be made without departing from the 
spirit and scope of the invention. The invention is limited 
solely by the appended claims. 

While in the preferred embodiment described above, a file 
of the type used in DOS based computers was transferred 
from computer 1 to computer 2, it should be understood that 
in alternative embodiments, the invention could be used to 
transfer other types of "files" between computers. For 
example the invention could be used to transfer a particular 
string of bytes from one computer to a second computer. 
Thus, it should be understood that the present invention can 
be used to transfer any string of bytes (herein termed a 
"file") from one computer to a second computer. 

It is also noted that the segments used in the above 
described preferred embodiment are 128 bytes long. Longer 
or shorter segments could be selected depending upon the 
particular nature of the files being transmitted. Furthermore 
the segment length could be made dependent on various 
factors such as whether there is a file in computer 2 with the 
same file name and file type as the file in the sending 
computer or the file type of the file being transferred.

It is noted that although, as described, the file is divided into
fixed length segments, alternative techniques for dividing the file
into segments could also be used. It is also noted that as
described herein the files are first divided into segments, the
SPTs are generated and then the segments where there is no
matching segment in the old file are transferred. In alternative
embodiments, while the SPTs are being generated, some
parts of the file could be transferred in order to save some
additional time.

We claim: 

1. A method of transferring a first file from a first
computer to a second computer, where a second file is stored 
on said second computer comprising the steps of: 

a) analyzing one of said files to generate a Delimiter 
Selection Profile Table (DSPT), 

b) selecting a delimiter based on said DSPT, 

c) analyzing both the first and the second files to generate 
a first Segment Profile Table (first SPT) and a second 
Segment Profile Table (second SPT) for segments
defined by the selected delimiter for both the first and 
the second files, 

d) sending to the first computer the second SPT, 

e) sending from the first computer to the second computer 
a first set of segments of said first file which do not have 
a matching entry of the first and second SPT, and 

f) combining at the second computer the segments in said 
first set of segments, with a second set of segments 
from the second file which have corresponding entries 
in said first and second SPT, 

whereby said first file is quickly transferred to said second
computer.

2. The method recited in claim 1 wherein said files consist
of a series of bytes and where said delimiter is two bytes
long.

3. The method recited in claim 1 wherein said second file
is analyzed to generate said DSPT.

4. A system for transferring a first file from a first computer to
a second computer, where a second file is stored on said
second computer comprising:

a) means for analyzing one of said files to generate a
Delimiter Selection Profile Table (DSPT),

b) means for selecting a delimiter based on said DSPT,

c) means for analyzing both the first and the second files
to generate a first Segment Profile Table (first SPT) and
a second Segment Profile Table (second SPT) for
segments defined by the selected delimiter for both the
first and the second files,

d) means for sending to the first computer the second SPT,

e) means for sending from the first computer to the second
computer a first set of segments of said first file which
do not have a matching entry of the first and second
SPT, and

f) means for combining at the second computer the
segments in said first set of segments, with a second set
of segments from the second file which have
corresponding entries in said first and second SPT,

whereby said first file is quickly transferred to said second 
computer. 

5. A method of transferring a first file from a first 
computer to a second computer, where a second file is stored 
on said second computer comprising the steps of: 

a) analyzing one of said files to generate a Delimiter 
Selection Profile Table (DSPT), 

b) selecting a delimiter based on said DSPT,

c) analyzing both the first and the second files to generate
a first Segment Profile Table (first SPT) and a second
Segment Profile Table (second SPT) for segments
defined by the selected delimiter for both the first and
the second files,

d) sending to the first computer the second SPT,

e) sending from the first computer to the second computer
a first set of segments of said first file which do not have 
a matching entry of the first and second SPT, and 

f) combining at the second computer the segments in said 
first set of segments, with a second set of segments 
from the second file which have corresponding entries 
in said first and second SPT, 

whereby said first file is quickly transferred to said second
computer; and

wherein said files consist of a series of bytes and where
said delimiter is two bytes long.

6. The method recited in claim 5 wherein said second file
is analyzed to generate said DSPT. 

7. A system for transferring a first file on a first computer 
to a second computer, 

where a second file is stored on said second computer 
comprising 

a) means for analyzing one of said files to generate a 
Delimiter Selection Profile Table (DSPT), 

b) means for selecting a delimiter of at least two bytes 
based on said DSPT, 

c) means for analyzing both the first and the second files 
to generate a first Segment Profile Table (first SPT) and 
a second Segment Profile Table (second SPT) for 
segments defined by the selected delimiter for both the 
first and the second files, 

d) means for sending to the first computer the second SPT, 

e) means for sending from the first computer to the second 
computer a first set of segments of said first file which 
do not have a matching entry of the first and second 
SPT, and 

f) means for combining at the second computer the 
segments in said first set of segments, with a second set 
of segments from the second file which have corre- 
sponding entries in said first and second SPT, 

whereby said first file is quickly transferred to said second 
computer. 

* * * * *